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ABSTRACT 

In A^-body simulations of coUisionless stellar systems, the forces are softened to reduce 
the large fluctuations due to shot noise. Softening essentially modifies the law of gravity 
at r = \xi — Xj\ smaller than some softening length e. Therefore, the softened forces 
are increasingly biased for ever larger e, and there is some optimal e which yields 
the best compromise between reducing the fluctuations and introducing a bias. Here, 
analytical relations are derived for the amplitudes of the bias and the fluctuations in 
the limit of e being much smaller than any physical scale of the underlying stellar 
system. In particular, it is shown that the fluctuations of the force are generated 
locally, in contrast to the variations of the potential, which originate from noise in the 
whole system. Based on these asymptotic relations and using numerical simulations, I 
study the dependence of the resulting force error on the number of bodies, the softening 
length, and on the functional form by which Newtonian gravity is replaced. The widely 
used Plummer softening, where each body is replaced by a Plummer sphere of scale 
radius e, yields significantly larger force errors than do methods in which the bodies are 
replaced by density kernels of finite extent. I also give special kernels, which reduce the 
errors even further. These kernels largely compensate the errors made with too small 
inter-particle forces at r < e by exceeding Newtonian forces at r ~ e. Additionally, 
the possibilities of locally adapting e and of using unequal weights for the bodies 
are investigated. By using these various techniques without increasing N, the rms 
force error can be reduced by a factor 2 when compared to the standard Plummer 
softening with constant e. The results of this study are directly relevant to A^-body 
simulations using direct summation techniques or the tree code for force evaluation. 
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1 INTRODUCTION 



describing the stellar dynamics, and 



A''-body techniques are applied to stellar dynamical prob- 
lems ranging from collision-dominated systems, like stellar 
clusters and galactic centres, to systems with extremely large 
numbers of particles, such as galaxies and large-scale struc- 
ture. In the first case, one is actually interested in simulating 
a system of A*' gravitationally interacting particles, which is 
done by colUsional A'^-body codes. 




(2) 




describing gravity. Here, f{x,v,t) is the (continuous) one- 



In systems with very large numbers of particles, on the 
other hand, the two-particle relaxation time greatly exceeds 
the age, i.e. these systems behave essentially coUisionless, 
very much like a continuous system. In other words, the shot 
noise is negligible, and each particle essentially moves in the 
mean force field generated by all other particles. Thus, the 
aim of any coUisionless A'^-body simulation is to simultane- 
ously solve the coUisionless Boltzmann equation (CBE) 



^ Correlations between particles (e.g. binaries) are not described 
by / but by the two-particle distribution function and higher 
orders in the BBGKY hierarchy (see, e.g., Binney &; Tremaine 
1987). 




0, 



(1) 
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N points {xi,Vi} sampled from the DF ai t = form a 
representative sample of / at each time t. 

The result of this procedure is a TV-body code, how- 
ever, with a crucial difference to coUisional A''-body codes: 
the bodies do not represent real particles, rather they are 
a Monte-Carlo representation of the underlying continuous 
DF. In fact, all the information on f{x,v,t) is given by the 
phase-space positions of the bodies. Since A'^ is much smaller 
than the number of particles in the system being modelled, 
this information is always incomplete. 



1.1.3 Estimating the true forces 

One may consider force softening as a way to obtain, at 
each time step, from the TV body positions an estimate for 
the gravitational forces generated by the continuous mass 
density of the system being modelled (Merritt 1996, see also 
below). In this context, softening essentially implies the as- 
sumption that the mass density is smooth on small scales. 
This assumption compensates for the aforementioned incom- 
pleteness in our knowledge of the DF, a standard method in 
non-parametric estimation techniques. 



1.1 The rationale of force softening 

Having solved for the dynamics, we still need to compute 
gravity. The simple-minded approach of replacing / in equa- 
tion (^) with a sum of 5-peaks at the body positions corre- 
sponds to a Monte-Carlo integration (Leeuwin et al. 1993) 



and yields the forces of TV mutually interacting particles. 

In practice, one softens the forces at small inter-body 
separations r — \x' — x\, which is equivalent to replacing 
the bodies by some extended mass distribution instead of 
5-function^. Since there appears to be some confusion for 
the proper reasons to introduce softening, I will now try to 
give a thorough motivation. 



1.1.1 Suppressing artificial relaxation? No! 

Contrary to widespread belief, softening d oes not mu ch re- 
duce the (artificial) two-body relaxation ( Theis 199^ ), the 
time-scale of which scales like TV/ In N and which cannot be 
neglected in simulations of coUisionless stellar systems. 

To see this, recall the works of Chandrasekhar and 
Spitzer, who showed that two-body relaxation is driven by 
close as well as distant encounters: each octave in distance 
is contributing equally. By softening the forces at small r, 
one reduces the contributions from close encounters. Most of 
the relaxation, however, is due to noise on larger scales, and 
hence cannot be avoided by softening techniques, in agree- 
ment with the numerical findings of Hernquist & Barnes 
(1990). 



1.1.2 Avoiding artificial two-body encounters 

Without softening, the accurate numerical integration of 
close encounters and binaries, the main obstacle in coUi- 
sional TV-body codes, requires great care and substantial 
amounts of computer time. However, as already mentioned, 
the bodies are just a representation of the one-particle DF, 
and consequently binaries as well as two-body encounters 
(and the Newtonian forces arising in them) are entirely ar- 
tificial. Thus, softening may be motivated by the desire to 
suppress such artifacts. 



In describing their dynamics, on the other hand, the bodies 
are treated as point-like test particles. Dyer & Ip (1993) argued 
that this asymmetry is physically inconsistent. However, bodies 
do not represent physical particles and no physical inconsistency 
can arise. The goal is not the simulation of N mutually interacting 
particles point-like or not. 



1.1.4 The benefits of softening 

As a benefit of softening, close encounters between bodies 
are much less of a problem: one can use the simple leap- 
frog integrator, a method that would be highly inefficient in 
coUisional TV-body codes. Moreover, since only an estimated 
force fields is computed, approximate, and hence cheaper, 
methods for its computation than direct summation may be 
used, as long as the errors introduced by the approximation 
are smaller than those due to the estimation itself. Both 
these effects lead to substantial savings of computer time, 
which in turn enable much larger numbers of bodies in col- 
lisionless than in coUisional TV-body codes (currently about 
3-4 orders of magnitude more). This may be considered the 
real motivation for softening. 



1.2 Optimal softening 

Softening also has a severe drawback: it artificially modifies 
mutual gravitational forces at small inter-body separations: 
instead of Newton's inverse square law, the forces vanish 
in the limit r — > (necessary to yield a continuous F{x)). 
With increasing degree of softening, this results in an ever 
stronger bias, the systematic error of the mean estimated as 
compared to the true forces, corresponding to a diminution 
of the information hold in the body positions. 

Thus, softening allows a trade-off between artificially 
high near-neighbour forces and a biased force field. Both do 
affect the fidelity but also the efficiency of a TV-body simula- 
tion. With little softening, the integration of the equations 
of motion requires great care, demanding substantial com- 
puter resources (if the time integration is not done carefully 
enough, two-body relaxation may be strongly enhanced and 
other unwanted effects may arise). On the other hand, the 
force bias introduced by softening modifies the dynamics, 
possibly on scales much larger than that on which forces 
are softened, which might deteriorate the simulation. The 
influence of these two effects on a TV-body simulation will 
certainly depend on (i) the purpose of the simulation, (ii) 
the properties of the system modelled and (iii) the time span 
integrated. This implies that the optimal way of softening 
depends on the nature of the problem investigated as well 
as the goal of the simulation. 

The idea of this paper is to make a first step towards 
answering the question of the optimal softening by investi- 
gating the effect of softening on the estimated force. That 
is, here we consider only the static problem of optimal force 
estimation. An investigation of the effect of softening on the 
dynamics is beyon d the scope of this paper, but see the dis- 
cussion in Section 6.3. 2 
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In some situations, one can reduce the noise by a careful 
rather than random cho ice of the traje ctories, the so-called 
"quiet start" technique (Sellwood 1983). However, if chaotic 
motion dominates most of the trajectories, e.g. in a merger 
simulation, the benefits of such a careful set-up quickly dis- 
appear. Here, I am interested in the general, i.e. worst, case, 
and assume throughout that the trajectories are sampled 
randomly from the DF. 



1.3 A formalism of softening 

A softening technique that is widely used is the Plummer 
softening, initiated by Aarseth's (1963) use of Plummer 
spheres to model galaxies in a cluster. In this method, each 
body is replaced by a Plummer sphere with scale radius e, 
resulting in the potential estimator (hereafter, an estimate 
for some quantity is indicated by a hat) 



4(03) = -G^ 



(3) 



where m = M /N is the weight (or mass) of each individual 
body. A more general estimator for the gravitational poten- 
tial is 



(4a) 



This formula clearly separates the two aspects of the soft- 
ening method: the softening kernel (j>{r), which determines 
the functional form of the modified gravity, and the softening 
length e. The Plummer softening, for instance, corresponds 
to (j> = (1 + r^)~^^^. The corresponding estimates for the 
force field F — — V"I> and the density p of the underlying 
smooth system are 



Fix) 



p{x) = 



where 



-G 



N 

Em 



JV 

Em 

1=1 



dr \ dr 



(4b) 



(4c) 



(5) 



is the kernel density, the mass distribution by which each 
body is replaced in these estimates. Note that kernels (p{r) 
which reproduce Newtonian gravity exactly for r larger than 
some finite radius ro correspond to density kernels of finite 
support, i.e. ri{r) = for r > tq. 

Following the terminology of statisticians (e.g.. Sil- 
verman 1986), the above approximations in equations (^ 
may be called fixed-kernel estimators because the softening 
length e is the same for all bodies. 



1.4 Outline of the paper 

The reader might be surprised by this introduction being 
rather void of quotations. In fact, even though A''-body 
methods with softening are widely used by astronomers, no 
detained investigation of the properties of force softening 



has yet been undertaken. Some papers deal with the small- 
scale truncation of the mutual forces, but only few consider 
this bias in conjunction with the reduction of noise. Mer- 
ritt & Tremblay (1994) dealt with estimators for the density 
of stellar systems, and pointed out the close connection to 
softening in A'^-body simulations. Merritt (1996) introduced 
the important concept of the mean square force error (see 
Section |^ below) and studied empirically the effect of the 
softening length on the errors of the forces estimated by 
Plummer softening. Athanassoula et al. (1998, 2000) have 
extended this work to larger A'^ and to using the tree code 
rather than direct summation as Poisson solver. 

A systematic investigation of the dependence of the 
force errors on the softening method, in particular for other 
than Plummer softening, is still missing, and the goal of 
this paper is to fill this gap. Merritt and Athanassoula et al. 
found, by brute force, empirical relations for the dependence 
on A'^ of the optimal softening length, topt, and the result- 
ing force error, mainly for the case of Plummer softening. 
Instead, I derive in Section ^ and Appendix^ of thispaper 
analytic expressions, valid for softening of the form (W), for 
the force error in the asymptotic limit of small e and large 
A*'. These asymptotic relations apply whenever e is smaller 
than any physical scale of the stellar system being modelled, 
a condition that must be satisfied for the A^-body simulation 
to be of any use. Based on the results of this investigation, 
I discuss in Section ^ the optimal softening length and ker- 
nel, and the asymptotic scaling with A'^ of topt and the error 
achieved, while numerical experiments that demonstrate and 
verify these analytic results are presented in Section |^. 

Apart from just increasing the number TV of bodies and 
of the choice of the softening kernel and length, the A''-body 
experimenter may use a scheme to locally adapt the soft- 
ening, and use different individual weights. These two latter 
techniques, which aim at enhancing the resolution in regions 
of higher density and thus decreasing the force error, are dis- 
cussed in Section ^ and related numerical experiments are 
presented in Section |5.3| . In Section ^, I discuss the relevance 
of the results of these investigations to other Poisson solvers 
than direct summation, and touch the question for the op- 
timal collisionless A'^-body code and related issues. Finally, 
Section |^ sums up and concludes. 



2 THE ERRORS OF THE ESTIMATES 

Let us consider the mean square error (MSB) of the estimate 
a{x) for some field a{x) 



MSB, (a) = na{x) ~ a{x)]' 



(6) 

where {) denotes the expectation value, the ensemble aver- 
age over all random realizations of the underlying smooth 
density distribution p{x) with TV positions. Elementary ma- 
nipulations yield 

MSB^ (a) = [bias^ (a)] ^ + var^ (a) (7) 
with 

biasa; (a) = ^a(a;)^ — a(a;), (8) 
var^a) = ^ [a{x) - (a(a^))]') = {&Hx)) - (aix))' . (9) 
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When applied to our problem, the bias is the deviation of the 
softened potential and forces in the continuum limit N ex 
from the true potential and forces of the underlying system. 
The variance, on the other hand, describes the amplitude of 
the random deviations of the estimated potential and force 
from their expectation value. Softening reduces the vari- 
ance but introduces a bias. Thus, there must be an optimal 
amount of softening between these two extremes (Merritt 

i99e|)~] 

2.1 Asymptotic behaviour of the bias 

The behaviour of bias and variance for small e is derived 



analytically in Appendix Al.l. For the biases of the softened 



potential ([fa|), force (^b|), and density (|4c|), one finds 
bias,($)= aoe'^Gp{x) + a2e^GAp{x) + Oie^), (10a) 
bias^(F) = -aoe^GVp(a;) - a2 e'^GVAp{x) + 0{e^), (10b) 
bias4p)= ^e'Apix) + ^e^AAp{x) + 0{e'') (10c) 

(e.g. Hernquist & Katz 1989). Here, Ofc are constants that 
depend only on the kernel: 



Ik = 



(fc + l)!./o 

(47r)2 



dr r 



--4>{r) 



(fc + 3)! 



dr r T?(r). 



(11a) 
(lib) 



Equations ( |10[ ) are based on the low orders of a Talyor ex- 
pansion of p{x — ez) around x. Thus, if the contributions 
of the higher-order terms are non-neglible within the soft- 
ening volume, these equations fail. Such a failure may oc- 
cur for various reasons. First, if e is not much smaller than 
the local physical scale of the underlying system, the soft- 
ening volume is large enough for the higher-order terms to 
contribute. Second, the underlying stellar system may have 
power on all physical scales, for instance, in a density cusp. 
In such a case, the Taylor expansion does not converge, but 
the bias is usually smaller than equations ( p^ ) predict (see 
equations (A7)). 

Finally, equations ( p^ ) fail if the kernel density rj{r) 
approaches zero not fast enough at large r. In this case, 
the integrals in equations ( |ll] ) do not exist, and the bias (i) 
cannot be described by local quantities and (ii) grows faster 
than e^. Hereafter, kernels with existing ag will be called 
compact. At large radii, the density of a compact kernel is 
either identical zero or falls off more steeply than . 

For compact kernels with positive definite ri{r), < 
ao < oo and these biases have some notable properties. First, 
they increase only like e^. Second, the bias of the potential is 
proportional to the density and hence everywhere positive, 
i.e. gravity is always under-estimated. Third, the biases are 
smaller for more compact kernels, as measured by ao- And, of 
course, the biases are independent of A*', since they describe 
the deviation from the smooth underlying model in the limit 
N <x (this last point holds for any kernel). 



2.2 Asymptotic behaviour of the variance 

Assuming the N positions Xi are independent, the variances 
of various estimates are times the variance due to the 



estimate obtained from just one body, and, by virtue of the 
central-limit theorem, the distribution of the estimates is 
nearly normal. For the a symptotic behaviour at small e I 
derive in Appendix 



A1.2 



A^var,(-i')=var°($)-fe<i>G^Mep(a;) + ©(e^), 
Avar,(F) =6fG^ Me~V(a^) + ^(e"), 



A^vara; (p) — bp M e~ 



'p{x) + 0{e-'), 



(12a) 
(12b) 
(12c) 



where only the trace of the force variance is given, which to 
lowest order is proportional to the unit matrix. Here, 



var!^($) = G^M 



pjy) 

\x - y\ 



(13) 



is the potential's variance per body in the absence of any 
softening, while the constants 



his>=A% I dr \^ — r'^4?{r) 



(14a) 



bp = 4:11 j dr r (f> (r) = (47r) / dr r 7;(r) ^(r), (14b) 
Jo Jo 

/•oo 

bp =47r / dr r'^ri\r) (14c) 
Jo 

depend on the kernel only. 

Even though equations ( p^ ) are of paramount impor- 
tance for the understanding of potential, forc e and density 
estimation, it appears that only equation (12c) has been de- 
rived previously (e.g. Silverman 1986), because density es- 
timation is a widely applied and well-established technique. 
There is a fundamental difference between the behaviours 
of the variance of the potential on the one side and that 
of the force or density on the other side. In the limit of 
vanishing softening, the variance of the potential becomes 
A~^varl^(4>), which is a finite global measure, while the vari- 
ances of force and density diverge in the limit of e — > and 
depend on the local density. As a consequence, many of the 
textbook results on density estimation will carry over into 
force estimation - though with changes in detail due to the 
different dependence on e as e ^ - and hence be useful for 
the Af-body field. 



3 THE OPTIMAL FORCE ESTIMATE 

3.1 The mean integrated square error 

So far, we have considered the error of the estimates at one 
point X. In order to assess the global error introduced, one 
usually averages the local mean square error over the whole 
stellar system to obtain the MISE (mean integrated square 
error, e.g. Silverman 1986, Merritt 1996) 



MISE(a) = M"^ j A^x p{x) MSE^ 



(a). 



(15a) 



This can be divided into the integrated squared bias (ISB) 
and the integrated variance (IV): MISE=ISB-|-IV, where 



ISB(a) = M"^ j d?x p{x) [bias^(a)]^ 
IV(a) = M"^ / <fx p{x) var^(a). 



(15b) 
(15c) 
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We now apply the asymptotic relations of Section ^ in order 
to obtain expressions for the mean integrated force error, 
MISE(i^ ), in the asymptotic limit of small e and large TV. 



values of A*', the smaller coefficient seems to give Plummer 
softening an edge). 



3.2 The optimal softening length 

3.2.1 Compact kernels with ao 7^ 

For compact kernels w ith rj > 0, we have ao 7^ and in- 
serting the estimates (10b) and (12b) into equations ( [l5| ) 
gives 



3.2.3 Kernels with ao = 



MISE(F) 



1 -1 



(16) 

(17a) 
(17b) 



with 

A = G'A'/-^ / d^'a; p{x) [Vp{x)] ^ 
B = G^ Jd^x p^{x), 

which depend on the underlying system. In the asymptotic 
relation (llq) the dependences of MISE(F) on the number 
A'^ of bodies, the softening length e, the kernel function, and 
the underlying stellar system are nicely separated. Choosing 
e such as to minimize MISE(i^) gives 

eopt ~ 4-1/^ {B/Af/'' {bp/alf'" iV-l/^ (18a) 

MISEopt(f ) ~ 4-''/^5 {AB^f^^ {bjr ag)'/^ N"^^"" . (18b) 

We also find that at optimal softening IY{F) = 4ISB(f ), 
i.e. the contributions to the MISE(i^ ) from variance and bias 
are of similar size. In the asymptotic regime, the MISE(i^) 
depends on the kernel through the combination (fe^ag)^''^ 
only, which in Table ^ is given as Ep. 

As an example, consider the estimation of the forces in a 
Plummer sphere using the Fi softening kernel (see Table 
Using the above relations and the (analytic) values of A and 
B for a Plummer sphere with unit mass and scale radius, 
we find (G = 1) 



MISEopt(f ) ~ 0.58 X Af-*/^ 



3.2.2 Plummer softening 

For Plummer softening, oq (equation |l^) does not exist, be- 
cause the softening kernel is not compact, i.e. in the sense 
of equations ( [lo| ) the softening volume (as measured by ao) 
is infinite. As a consequence, the bias increases faster than 
. For centrally concentrated systems, I found from numer- 
ical experiments that^ISB(J') oc e^ "^ at e ~ 0.01 scale radii, 
both for a Plummer sphere and a Hernquist model as target. 
Using this empirical relation, we get MISEopt (f') oc N~^''^'^ 
in good agreement with the numerical results of Athanas- 
soula et al. (2000), who find for 10^ < Af < 3 x 10^ and a 
Plummer sphere as underlying model: 

MISEopt(f ) « 0.45 X Ar-°'^^ 

This relation can be directly compared to that given above 
for spline softening. Evidently, with ever larger TV Plummer 
softening becomes ever less effective than compact kernels 
in reducing the MISE(i^) (for small but historically relevant 



As we have seen in equation (10b), the bias in the estimated 
force is, to lowest order, equal to — Ge'^aoVp. One may actu- 
ally estimate the density in order to subtract this bias. This 
procedure, however, is completely equivalent to using in the 
f irst place a kernel (f){r) for which ao = 0. From equation 
( lib ) , it follows that such a kernel corresponds to a density 
rj{r) that must be negative at some r. For such a kernel. 



MISE(f ) « A' aa -f B hp N'^e' 
with 

A' = Af-i Jd^x p{x) [VAp(a;)] 



(19) 



(20) 

and B given in equation ( |l7h| ) above. The softening that 
minimizes MISE(F) yields 

eopt « 8-^/^ (BM')'^' {bF/aiy/" N'^^^ (21a) 

MISEopt(F) ~ 8-^/^9 {A' B^f' {hUif' N'^^\ (21b) 

This time, the dependence of the optimal MISE(i^) on the 
kernel can be boiled down to the constant {b%-a%Y^^ , which 
is given in Table ^ as Ep. For the Plummer sphere as un- 
derlying model, the kernel Ki (see Table |^), equation (21b) 
gives for large TV (in the same units as above) 

MISEopt(f ) « 1.00 X N-^'^. 



3.3 The choice of the kernel 

After these considerations, we can answer the important 
question of the optimal kernel 4>{r). Below I list criteria, 
based on the above consideration and decreasing in impor- 
tance, which should be satisfied by the kernel (j). 

1. In order to reduce artificial small-scale noise, the esti- 
mated force must be everywhere continuous, i.e. the ker- 
nel must have a harmonic core^, i.e. 0' oc r for r ^ e. 

2. In order to retain Newtonian gravity as close as possible 
with the above restriction, the force bias should be as 
small as possible. With the above results, this implies 
that the kernel should be compact, or, even better, have 
(j) = r~^ (equivalent to = 0) for r > ro, where ro is of 
the order of one. 

3. The kernel should not only yield continuous forces ev- 
erywhere, but also continuous first, and possibly second, 
force derivative. This is to facilitate accurate integration 
of the equations of motion. 

4. The kernel and its derivative should be easily evaluable. 

5. The kernel should result in the smallest possible mean 
integrated square forece error, MISE(f'). 



^ The reader may compare this with Fig. 4 of Athanassoula et 
al. (1998), which clearly gives a slope very near to 3, even though 
the authors of that paper use bias^ <x as asymptotic relation. 



* Actually, a super-harmonic core with <f>' oc r"(for r <C e) with 
integer ra > 1 is also acceptable, but gives a larger bias. 
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Table 1. Some 3D softening kernels and their properties 



name 


kernel density •r](r) 


^max F 


0max 


0.0 


012 


6$ 


hp 


bp 


Ep 


Plummer 


3 1 
An (l + r2)5/2 


U. iUr 


n Q Q /I non 
0.384900 


00 


00 


Z7r 


37r2 
4 


45 
1024 




cubic spline 


\ T^y^ ~ ~^ -^^ I r < i 
< 4^(2-r)3 l<r<2 
[0 r > 2 


0.828302 


0.657817 


37r 


177r 


11207897r 


7001677 


491 


9.84 


T 


"eo" 


450450 


17325 


126077 


Fo 




1 


1 


2-K 

5 


7r 
70 


727r 
35 


2477 

5 


3 

477 


9.60 


Fi 




0.745356 


1.242260 


27r 


7r 

126 


4007r 
231 


4077 


15 
14^ 


9.65 


F2 


— (1_ ,.2)2^(1 _ ^2) 
327r 


0.592614 


1.637096 


27r 
"9" 


7r 
T98 


19607r 
1287 


280077 
429 


35 
22^ 


9.71 


F3 


Hi(l_r2)3^(l_^2-) 
647r 


0.505871 


2.051564 


2tv 

TT 


7r 

Tse 


6350471 
46189 


1764077 
2431 


315 

14377 


9.75 




15 /K 7„2n rr,--, 2n 

Ibn 






n 
U 


7r 
T26 


1707r 
231 


1 (Itt 
lUTT 


75 

Te^ 


Q 4*^ 


Ki 


Hi(5-9r2)(l-.2)^(l_,2) 
D47r 


0.493924 


3.436176 





77 

"T98 


2807r 
429 


I6IO77 
143 


525 
88^ 


9.48 


K2 


-^(5~llr2){l-r2)2H(l-r=^) 

iZoTT 


0.419491 


4.296037 





77 

^286 


2734277 
46189 


3024077 
2431 


9135 

114477 


9.54 


K3 


^'^^'^ (5 13r2){l r'^fHil r') 
10247r^ /\ ; V ; 


0.370935 


5.170685 





7r 

~390 


277277 
5083 


5682677 
4199 
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The kernels Fn (Ferrers (1877) spheres of index n) and K„ are continuous in the first n force derivatives {H denotes the Heaviside 
function). The potential of the Plummer kernel is <j>(r) = l/Vl + r'^ , that of the cubic spline kernel is given by Hernquist & Katz 
(1989), while for the kernels F„ and K„, the potentials are low-order polynomials in and can be found in Appendix r,„axF is 
the radius, in units of e, of the maximal force, while 0j„ax = (''max F ) ■ The constants ag and a2 are related to the bias introduced 
by the softening (equations while the constants 6^, 6^, and bp are related to the variances (equations p^ ). Ep (defined in the 
text) is a measure for the efficiency of the kernel: for fixed TV and optimal e, the mean integrated squared force error, MISE(^'), is 
directly proportional to Ep, though with a different constant of proportionality for compact non-negative kernels (spline and F„) and 
compensating kernels (K„), respectively. 



Here, the exact order of the last three points is debatable. 
For instance, a kernel that results in higher-order continu- 
ity at the price of being much more expensive to evaluate 
and/or of resulting in much worse a force error is not rec- 
ommendable. Regarding point 4, it might be interesting to 
note that for kernels that are polynomials in r^, all mutual 
forces can be computed in 0{N) operations (provided all 
mutual interactions are softened). 



3.3.1 Kernels with non-negative density 

It is instructive to check the most widely used kernels against 
this list. All of them satisfy the most important criterion 
1, but the very popular Plummer softening already fails 
at point 2: it is non-compact and produces a large bias, 
i.e. gravitity is modified not o nly at small distances, but 
on all scales (see Section 3.2.2| ), which also results in large 
a MISE(F) (point 5). The homogeneous-sphere (=Ferrers 
sphere of index n = 0) softening proposed by Pfenniger & 
Friedli (1993) has only continuous forces, but discontinuous 
force derivative and thus fails at point 3 (but satisfies 4), 
while the popular spline softening introduced by Monaghan 
& Lattanzio (1985) in the context of smoothed particle hy- 
drodynamics and advocated by Hernquist & Katz (1989) 
also for force estimation fails in the least important points 4 



and 5 (the latter follows from its efficiency Ep being some- 
what larger than for other compact kernels, see Table 0)^ 

Thus none of the popular softening kernels satisfies all 
criteria! Table ^ list these three kernels together with fur- 
ther possible kernels. The Ferrers (1877) sphere kernels of 
integer index n > may be derived (see Appendix ^) as the 
simplest kernels (in the sense of being low-order polynomials 
in r^) that satisfy the first four of the above criteria and are 
continuous in the nth force derivative. 

Figure ^ plots the potential (top), force (middle), and 
density ( bottom) of these kernels and of Newtonian gravity. 
The softening lengths e are scaled such that the maximial 
force, i.e. the artificial force scale introduced by softening 
(Newtonian gravity is scale free), is the same for all kernels. 
Obviously, Plummer softening gives by far the worst force 
approximation to Newtonian gravity (this holds for other 
ways of scaling e, too). The Fo (=homogeneous sphere) soft- 
ening gives the best force approximation, but at the price 
of an ugly discontinuity at r = e. The next best approxima- 



^ The rather complex functional form of the potential for the 
spline softening kernel originates from the necessity, in hydro- 
but not stellar dynamics, of an everywhere continuous second 
derivative of the density. An alternative kernel with this property 
is the n = 3 Ferrers sphere, F3 in Table hi see also Appendix M. 
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Figure 1. Potential, force, and density for Plummer and spline 
softening kernels as well as the Ferrers (1877) sphere kernels F„. 
The softening lengths e are scaled such that the maximal force 
equals unity. 



Figure 2. Like Figure |lj, but for the kernels K„ and the Plummer 

kernel. Near their outer edge, these first kernels have negative 
density and forces that exceed those of Newtonian gravity. As a 
result, the under-estimation of the force field due to too small 
forces near the origin is largely compensated. 



tion to Newtonian gravity is the Fi kernel (in the field of 
density estimation also known as "Epanechnikov" kernel). 
If its discontinuous second force derivative is a problem, one 
should use the F2 kernel, while the spline kernel does not 
satisfy point 4 of the above list and also compares some- 
what worse to Newtonian gravity than does the F2 kernel 
(but see footnote ^). 

One may want to derive the kernel that minizes the 
asymptotic force error, i.e. {b%ao), subject to positivity and 
continuity. However, such an optimal kernel is likely to be 
only marginally better than, say, the Fi kernel (which is op- 
timal for one-dimensional density estimation, see Silverman 
1986), at the price of having a significantly more complex 
functional form. 



3.3.2 Kernels with bias compensation 

In Appendix ^, some simple kernels are derived which are 
low-order polynomials in r^, yield oo = 0, and satisfy all 
points in the above list. These kernels, listed in Table |l] as 
K„, have forces with continuous nth derivative. Figure ^ 
plots their potential, force, and density in comparison with 
Newtonian gravity and the Plummer kernel. While in the 
inner parts these kernels are harmonic, i.e. have vanishing 
forces like the kernels shown in Fig. |l|, their forces exceed 
Newtonian gravity in the outer parts where their density is 
negative. This largly compensates the under-estimation of 
gravity in the inner parts and thus reduces the bias. As we 
shall see, these kernels prove to be superior for the purpose 
of force estimation, while for density estimation they are less 
useful because of their rj{r) not being non-negative. 



4 ADAPTIVE SOFTENING LENGTHS AND 
INDIVIDUAL WEIGHTS 

To further improve the force estimation one may use locally 
adapted softening lengths and/or individual weights. Both 
techniques aim at enhancing the resolution in regions of high 
phase-space density by reducing e or increasing the number 
density of bodies in such regions. Here, I will formally assess 
the effect of these possibilities on the force estimation. 

4.1 Definitions 

4.1.1 Individual weighting 

The formalism behind this is to draw the initial positions 
{xi,Vi} not from the DF f{x, v) modelled, but from a sam- 
pling DF /s, which is normalized to the same total mass M 
as /. In this case, the bodies have individual weights 

nii^MN'^fii, fj,i = fi{xi,Vi) (22) 

with the weight function 

fi{x,v) = f{x,v)/f^{x,v), (23) 

i.e. lighter bodies are used to compensate for oversampling, 
corresponding to fs being larger than /, and vice versa. 

A potential problem with this method is an enhanced 
artificial heating of the population of low-mass bodies, e.g. 
modelling a stellar disk, by two-body interactions with high- 
mass bodies, e.g. modelling a dark halo. In order to overcome 
this problem, I propose below to use larger individual soft- 
ening lengths for more massive particles. 
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4.1.2 Adaptive softening 

For density estimation (see Silverman 1986), the usual pro- 
cedure is to replace e in equations (^) by eXi, resulting in 
the adaptive kernel estimators (with individual weights nii): 



G \ ^ rUi 
G \ ^ m,; 



(24a) 
(24b) 



So, e no longer is a softening length, but the softening pa- 
rameter that controls the overall degree of softening. The \i 
are individual bandwidths, which are usually set to (Silver- 
man 1986, p. 101) Ai oc /5~" where pi is an estimate for the 
density at Xi. The sensitivity parameter a, a number satisfy- 
ing < a < 1, determines how sensitive the local softening 
length is to variations in the density. 

In the presence of individual weights, one may better 



Ai = fiY''^ {fii/ny 



(25) 



where fii are estimates of the local number density of bodies, 
while n is a normalization constant, hereafter the arithmetic 
mean of the fii. With this formula for the bandwidth, the 
maximal force in the neighbourhood of any body, i^max oc 
mi\~^ , is independent of m^. For a = and jii = 1 one 
obtains the standard fixed-kernel estimators (^ . 

The number of bodies within a softening volume is ap- 
proximately 



(26) 



r,r 47r 3 3 , , 47r 3 _3c, 3/2 ^/ s1-3q 

iVsoft ~ — e Ai n(a;) — e n n(x) 

Thus, the choice a — 1/3 results in j-L~^^^ Ngoit being ap- 
proximately constant over the entire system. In this case, 
one may, alternatively to equation (^^, determine the local 
softening length such that the number of bodies within the 
softening volume is exactly proportional to fi^^^ (or constant 
if the bodies are equally weighted). In this case, the (fixed) 
number fj.~^^^ Nboh is the softening parameter. 

In order to obtain the estimates hi for the number den- 
sities, one may use the estimator 



1^1 

1 ^ 
1 



\X — Xi\ 



(27) 



with the bandwidths A^ from the previous time stepg. 

If the method of individual weights is applied with a — 
0, equation (^5|) implies that the bodies have individual, 
though non-adaptive, bandwidths. Note that this is by no 
means the only way incorporate rUi into Xi - other ways may 
work as well, but here I only consider equation (pSj). 



° However, this method introduces a time asymmetry, which in 
turn can lead to unintended consequences. Therefore, one may use 
a simpler, and hence faster computable, estimate for the number 
density — it is actually not important that the estimates hi usd in 
equation ( |25| ) are very accurate (see Silverman 1986). In a tree- 
code, for instance, one may use the mean number density within 
cells containing a predefined number of bodies. 



Hernquist & Katz (1991) dubbed this sort of adap- 
tive softening the 'scattering' method. An alternative is the 
'gathering' method, in which the softening length depends 
not on the gravity source (and its position Xi), but on the 
sink. A considerable disadvantage of both the scattering and 
the gathering method is the implied violation of momentum 
conservation. However, momentum conservation can easily 
be resurrected by using the same softening length eij for 
the force of body i on body j and vice versa. Traditional 
choices are e^j — e^{Xi + A^) (particul arly useful in case of 



Plurn mer softening), eij = e(Ai-|-Aj)/2 ( Hernquist fc Barnes 



199C), or tij = e min(Ai,Aj). In a simulation of a galaxy 



group with each galaxy's stellar component represented by 
a single spheroidal particle, which was considerably more 
massive than the halo bodies and therefore had larger band- 
width, Barnes (1985) used eij — e max(Ai,Aj), to avoid 
strong heating by individual halo body interactions. 



4.2 Errors 

4.2.1 The bias 

The bias of the potential estimated using adap tive softening 
and individual weights is (see Appendices A2 and A3) 



biasa;(<l>) = ao Ge p[x) fi{x) 



n{x) 



(28) 



where Ji = Jd^vfp is the mass- weighted average of p of 
all trajectories passing through x. This result is independent 
of the method used to assign eij (it affects only the higher- 
order terms), while biasa!(i^) — Vbiasa,($). 

Let us first consider p. = 1 (equal weighting). In this 
case, n oc p and therefo re bi asa;(f') oc aoe^(l — 2a)p~^"Vp. 
Compared to equation (10b), the bias is reduced everywhere 
by the factor (1 — 2a) (for a = 1/2 we even have formally 
bias(F) = ©(e"), but in this case the derivation of equation 
( p8[ ) becomes inconsistent, see Appendix A2). Moreover, the 
factor [p(a;)/p]~^'^ decreases the bias in high-density regions 
and increases it in low-density regions, such that the density- 
weighted ISB(1<') is reduced. 

If in addition p ^ 1, the bias is further reduced. First 
n{x) is a steeper function than p{x) and hence is shal- 

lower resulting in a smaller gradient (^biasa, (J^)). Second, 
if the mean weight Jl is smaller in high-density regions, this 
produces a smaller gradient in the same way. 



4.2.2 The variance 



For the variance of the force, we obtain (see Appendices A2 



and 



A3) 



Nva.T^{F) = bpG^Me-^pix) p^/'^{x) 



n{x) 



+ 0(6°). (29) 



Let us again first consider the case of equal weights. 
Then n oc p, and vara;(i^) oc p^"^", i.e. in contrast to the 
bias, the variance is enhanced in high density regions, be- 
cause the softening length is reduced there. The correspond- 
ing increase in IV(i^) is smaller than the decrease in ISB(f'), 
because the density contrast p/p enters the integral for the 
evaluation of IV(i^) only to the power a, while in the evalua- 
tion of ISB(f ) it enters to the power of —4a. However, since 
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at the optimal softening, IV(i^) = 4ISB(i^), the decrease in 
the ISB(i^) must be more than four times the increase in 
the W{F), in order to yield an overall improvement in the 
optimal MISE(F) (which occurs at larger e than for the cor- 
responding fixed-kernel estimate). Due to this effect, we do 
not necessarily expect a dramatic improvement of the force 
estimation from adaptive rather than fixed kernel estima- 
tors. 

Consider next the situation for individual weights. We 
may replace n by p^.~^ to find that, comp ared to th e equal- 
weight case, the variance is multiplied by fj}/'^ . For pi 
not too far from unity, /i" ~ Jl'^ and the factor becomes 
~ 7I^/^~". That is, for a ^ 1/2, the variance of the force is 
reduced in regions of smaller than average weights, decreas- 
ing the global measure IV{F). 

Thus, when using individual weights in conjunction 
with adaptive softening, both variance and bias of the force 
are reduced in high-density regions and therefore also the 
mass-weighted averages IV(i^) and ISB(l^) as well as the 
overall force error MISE(F). 



5 NUMERICAL EXPERIMENTS 

So far, we have studied the properties of the softening purely 
analytically in the asymptotic limit of small e and large A*'. 
In order to study the behaviour at finite and to assess 
the applicability of the asymptotic relations, we will now 
turn to numerical experiments. To this end, we compute 
the bias and the variance of the force at a hundred radii, rt, 
equidistant in the enclosed mass, i.e. M(< r^) — (fc+i)/100, 
and estimate the ISB(i^) and IV(i^) as averages over these 
points. 



5.1 The targets 

The comparisons will be made for two spherically symmetric 
models. The first is the Plummer (1911) sphere with density 
and potential 



p(r) = 



3Mr? 



47r(ri +r2)5/2' 



<E>(r) = 



GM 



(30) 



(hereafter G = M = Vs = 1), which has the advantage 
that most quantities in the above estimates can be evaluated 
analytically. The Plummer sphere has a harmonic core with 
central frequency u = 1, i.e. tdyn ~ 27r, in these units. I will 
also use the Hernquist (1990) model 



p{r) 



Mr,, 



2-Kr(rs + r) 



$(r) 



GM 
rs +r 



(31) 



This model resembles the properties of galaxies better, since 
it has a central density cusp, where some of the asymptotic 
estimates made in Section |2| are not valid. 

The Hernquist model has a central force F — > —x/r 
as r = |a;| — > 0. Any softening yields a continuous force 
field and therefore inevitably gives (J^) = in the centre, 
resulting in a 100% central force bias. Since this error is 
restricted to the softening volume where the density scales 
as r~^ , its contribution to the overall ISB(f') scales as e^. 
Thus, because the ISB(i^) for a non-cusped system scales as 
e**, we expect the central bias to dominate the overall error 
for sufficiently small e, i.e. large A^. 



This is a generic problem of any cusped stellar sys- 
tem: the force field of a steep cusp cannot be resolved with 
any softening Moreover, a cusp creates further problems 
for Ai'-body simulations, since the dynamical time scale be- 
comes arbitrary small as r — » 0. This means that one cannot 
even hope to properly model a cuspy stellar system with any 
A'^-body method, but must accept a small artificial core with 
size of the order of the softening length^. Given this generic 
problem of all softening techniques, it makes sense to com- 
pare their performance only at r > e, i.e. ignore the problems 
at r — > 0. This is done automatically by our estimate of the 
ISB(i^), since the smallest radius for which we compute the 
bias is non-zero (it contains 0.005 of the mass), i.e. larger 
than e for small softening. 



5.2 Fixed kernel estimates 

5.2.1 Plummer sphere as target 

For most softening kernels of Table |l| (or Figures |l| and 
Figure ^ plots the MISE(i^ ), normalized by the averaged 
squared force 



(32) 



of the Plummer sphere, for A' = 10** bodies versus the scaled 
softening length 



[0'(j"niaxF)] 



-1/2 



(33) 



The effect of this scaling is that the maximal force generated 
by a body of mass m is m/e^, independent of the kernel (the 
distance from the body at which this maximal force occurs 
is different for the different kernels, see Figs. |^ and|^). 

In Figure the asymptotic behaviour at small and 
large softening lengths is immediately apparent. At small e, 
the integrated variance, TV{F), dominates the error budget, 
yielding a divergence as for e 0. At fixed e, Plummer 
softening yields the smallest IV{F), which is related to its 
long-range softening. 

At large e, the MISE(i^) is dominated by the integrated 
squared bias, ISB(i^), which increases steeply until e ~ 1, 
beyond which it saturates at MISE(j^) — {F'^), correspond- 
ing to a vanishing estimated force. As expected from our 
asymptotic results, the ISB(i^) grows like e** for the compact 
kernels with non-negative density (cubic spline & F„). For 
these kernels, the optimal scaled softening lengths and the 
resulting minimal MISE(f') are very similar. The latter is 
a consequence of the fact that the combination Ef ~ b^a^, 
through which the MISEopt(i^) depends on the kernel, does 
not vary much between those kernels (see Table 0). 



For density cusps steeper than that of the Hernquist model, i.e. 
p <x r ' as r — > with 7 > 1, which are observed in many early- 
type galaxies, the central force diverges resulting in a formally 
infinite force bias, unless e = 0. 

A steep power-law density cusp can be modelled down to 
r ^ O.le by settin g up initial y elocities in equilibrium with the 
softened potential (Barnes 1998). However, this amounts to mod- 
ifying the DF on scales r ^ e, not r ^ O.le. Thus, also in this 
approach the Af-body simulation does not match the stellar sys- 
tem modelled on a scale smaller than ^ e. 
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Figure 3. Fixed-kernel force estimate for most Isernels of Table |^ 
applied to N = 10* positions drawn from a Plummer sphere. 
In the top panel (a), the mean integrated squared force error, 
MISE(f'), normalized by the averaged squared force of the Plum- 
mer sphere, is plotted vs. the scaled softening length e. The middle 
and bottom panels (b and c) give the radial run of the biasa; (F) 
and the dispersion, [va.rx{F)]^^'^ , at the value of e that minimizes 
the MISE(f'). For comparison, the bold curve represents the true 
force in a Plummer sphere. 



For Plummer softening, the ISB(P) is much larger (up 
to one order of magnitude) and grows approximately like 
e^ ''. As a result, the optimal softening length, topt, is shorter 
(and hence the maximal force generated by any body larger) 
and the IVopt(i^) as well as MISEopt(f') are larger than for 



For the kernels K„, which have partly-negative ri{r) and 
ao = 0, the ISB(i^) is smaller than for the other kernels 
and grows like (for sufficiently small e). The IV (F) is 
only slightly larger than for the non-negative kernels, so that 
the minimal force error, MISEopt(f'), is significantly smaller 
than for other kernels and occurs at larger e, i.e. the maxi- 
mum force generated by a single body at optimal softening 
is smallest for these kernels. 

Figures ^ and 0c show the radial run of the bias and 
dispersion, [vara,(f')]^^'^, for the optimal softening, corre- 
sponding to the minima in Figure The lines for the 
four compact kernels with non-negative 77(7-) almost over- 
lay each other, as do the lines for the kernels K„. This is 
exactly, what is expected from the asymptotic relations of 
Section ti, because of the similarity of the factors Ef = b^ag 
and b%a2, respectively. In all cases, the dispersion is larger 
than the bias, in agreement with our expectations, based on 
the asymptotic relations, that IV(i^) = 4ISB(i^) (for com- 
pact kernels with ao 7^ 0). While in the inner parts of the 
Plummer sphere the bias contributes still significantly to the 
total error, in the outer parts the bias is neglible compared 
to the variance. This is true especially for the kernels K„, 
which were designed to reduce the bias. 



5.2.2 Hernquist model as target 

For N = 10" and 10^ Fi gure M shows the results of fixed- 
kernel force estimation applied to a Hernquist model. Com- 
pared with the situation for the Plummer sphere (Fig. ^) , the 
IV(i^) (i.e. the MISE at small e) is only marginally larger at 
any e, as expected from the asymptotic relation: after nor- 
malisation by averaged squared force, the coefficient B in 
equation (L7b) is 20% larger than for the Plummer sphere. 
The ISB(i^ ) on the other hand is significantly larger than 
for the Plummer sphere. This has the consequence that also 
the minimal MISE(J^) is larger and occurs at smaller e than 
for the Plummer sphere. In particular, at A'' = 10**, the rela- 
tive rms force error (MISEopt(f')/{-F'))^''^ is as large as 14%, 
and still about 5% at N ^ 10^. 

The differences in the MISEopt(f') between the various 
kernels are small at N = 10* but become clearer at N = 10^: 
the improvement in increasing N by an order of magnitude 
is significantly larger for the compact kernels, in particular 
K„, than it is for Plummer softening. 

As already discussed in Section the central cusp 
of the Hernquist model causes a severe difficulty for any 
A-body method. The innermost radius for which we com- 
pute the bias and variance is « 0.08, and we thus expect the 
estimated ISB(i^) to scale as for e larger than this. The 
corresponding change in the slope is evident in Fig. ^a. 

The radial runs of bias^(F) and [var^(F)]^/^ in Figs.Hb 
and c, respectively, are most interesting. In contrast to the 
situation for the Plummer sphere, the variance dominates 
at all body positions, except for the innermost few percent, 
especially for the kernels K,J^, where the cusp results in a 
large bias. 



compact kernels. 



^ This is related to a strange property of the Hernquist model: 
VAp, which in the asymptotic limi t is related to the force bias 
for these kernels (see equation ( |lOb| )), contains a part oc V5(a;). 
For finite softening length this is tapered to a function with some 
finite width of the order of e. 
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adaptive kernel estimates: the kernels for the force and den- 
sity estimators and their respective softening lengths as well 
as the sensitivity parameter a. However, as we have seen in 
the fixed kernel estimates, the Plummer kernel is clearly in- 
ferior to the other kernels discussed, and there is not much 
difference between the various positive definite compact ker- 
nels or between the kernels K„. Therefore, I will restrict the 
experiments to the Fi and Ki kernels for the force estima- 
tion, while the density is always estimated by the Fi ke rnel . 
Since the variance of the estimated density (equation 12c) 
increases much faster at small e than that of the estimated 
force, the optimal softening parameter for density estimation 
is always larger than that for force estimation. Therefore, I 
use twice the softening scale for the evaluation of p{xi) than 
in the force estimator. 



5. 3. 1 Plummer sphere as target 

For various choices of the sensitivity parameter a (equa- 
tion ^5^, Figures |5|a and ^ show the run with e of the 
MlSE(i^ ) obtained with the Fi and Ki kernels, respectively, 
from A'^ = 10* positions drawn from a Plummer sphere. The 
curves for a = Q are identical to those of the corresponding 
kernel in Fig. ^. Adapting the local softening improves the 
accuracy of the force estimation: at q = 1/3 the MISEopt(-F) 
has decreased by a factor of ~ 1.6, because the bias is sub- 
stantially reduced, while the variance has been only slightly 
increased, in agreement with the expectations from theory 



in Section 4.2 



Out of the numbers tested for a, the value 1/3 results 
in the best estimate; the case a = 1/2 will be discussed 
seperately in Section 5. 3. J . 



Figure 4. As Fig. ^ but for the Hernquist model as target with 
A'' = 10* (upper curves) and N = 10^ (lower curves). 



It is very instructive to consider the radial run of bias 
and variance at e = eopt in the lower panels of Figures ^ 
and ^ Compared to the fixed-kernel estimates (a = 0) , the 
adaptive-kernel estimates have reduced variance, while the 
run of the bias is much shallower, exceeding that for a = 
at large radii and running below it at small radii. Together 
this gives a better local balance between bias and variance 
and a local mean square force error, MSEa;(f') everywhere 
smaller than for the fixed-kernel estimators. 

It should be noted that, in case of the compensating 
kernel Ki, already a = 1/4 results in a significant error 
reduction. For a = 1/3 and a = 1/2 (here also for Fi), 
the bias changes sign. In these cases, the bias is a rather 
complicated function of radius, the asymptotic form of which 
has not been derived above. 



5.3 Adaptive kernel estimates 

Figures ^b and ^c nicely show the dilemma of fixed-kernels 
estimates: the large central biaScc(-F') requires, in order to 
achieve the global balance between ISB(J^) and W{F), a 
small softening length, which in turn results in a large 
vara; [iF) in the outer parts. It would be much better to bal- 
ance bias and variance locally, in order to obtain an optimal 
mean square force error at every position x. By choosing the 
local softening length in proportion to p~°, the adaptive ker- 
nel estimates do not quite reach this goal, but certainly go 
a good way in the right direction. 

There are many parameters which one may vary in 



5.3.2 Hernquist model as target 

Figures ||i and |a plot the MISE(f') versus e when applying 
the adaptive-kernel force estimator with, respectively, the Fi 
and Ki kernel to = 10* and 10^ positions drawn from the 
Hernquist mod el. A s anticipated from the asymptotic rela- 
tions in Section [Z!^ , the adaption reduces the ISB(i^) (dom- 
inating the MISE(i^ )at large e) substantially, much more 
than for the case of a Plummer sphere as target. However, 
the increase of the IV(i^) is also quite large, so that the 
MISEopt(i^) is reduced by about the same amount as for 
the Plummer target. A check with Figs. ^&c and ^&c 
shows that both ISB(i^) and IV(i^) are completely domi- 
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Figure 5. As Figure 
on the Fi kernel. T' 



but for an adaptive-kernel estimator based 
le softening scale of the density estimator 
has been set to twice that of the force estimator. Note that a 
sensitivity parameter of o = corresponds to the fixed-kernel 
estimator of Fig. 



nated by the large contributions from the innermost few per 
cent of the mass. 

Interestingly, there is a difference between the kernels 
Fi (with ao 7^ 0) and Ki (with ao = 0): for the latter the 
adaption does not reduce the MISEopt (F) as much as for the 
Fi kernel, and, as already found with the Plummer model 
as target (cf. Figure the difference between a — 1/4 and 
1/3 is marginal. 
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Figure 6. As Figure ^ but using the kernel Ki in the force esti- 



mator 



5.3.3 What is the optimal a? 

Special attention must be given to the case a = 1/2, for 
which the asymptotic relation (^8|) breaks down. The be- 
haviour both of the MISE(i^) and of the biaSii;(i^) become 
strange at this choice of the sensitivity parameter. For exam- 
ple, in Fig. 0, Q = 1/2 gives the smallest MISEopt (J'), but 
the bias reaches ~ 10% in the outer regions of the model, 
where it usually is negligible. In other cases, a = 1/2 results 
in MISE(i^) that behaves very odd as function of e, some- 
times even with two minima (see Fig. |^) , and in most cases 
do not yield the smallest force error found. 

The peculiar properties of q = 1/2 have been noticed in 
the context of density estimation a number of times (Scott 
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Figure 7. As Figure ^ but for an adaptive-kernel estimator based 
on the Fi kernel. The softening scale of the density estimator has 
been set to twice that of the force estimator. 



1992), in particular, that for large A'^ such estimates can 
be worse than fixed-kernel estimates - in agreement with 
Figure ^ This suggests that the behaviour of adaptive-kernel 
estimates with a = 1/2 is not well understood (even in the 
well-studied context of density estimation), and may thus 
be a topic for further investigations. 

In the face of these results, the choice a = 1/2 cannot 
be recommended. However, a value of 1/3 for the sensitivity 
parameter, corresponding to a roughly constant number of 
bodies within the local smoothing volume, seems to work 
well: it reduces the MISEopt(i^) by a factor of about 1.6, 
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Figure 8. As Figure ^ but using the kernel Ki in the force esti- 



mator 



the precise value depending on the circumstances (kernel & 
underlying model). 



6 DISCUSSION 

6.1 Relevance for other than direct-summation 
codes 

In our analysis, we have assumed that the forces are evalu- 
ated by direct summation, which is a rather CPU time in- 
tensive technique. Consequently, cheaper but approximate 
methods are favoured by most A'^-body eng ineers, with the 
exception of GRAPE (Ebisuzaki et al. 1993), which reaches 
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a higher performance by wiring equation (^) into the hard- 
ware. 

The force error arising from these approximate meth- 
ods, some of which combine softening and approximation 
into one process, follows not necessarily the relation (|l6|). 
Nonetheless, the MISE(f') is a function of because of 
var(P) cx A''"^. Of course, the force error also depends on 
the parameters of the method (including the softening length 
e) . Hence, there exist optimal values for these parameters in 
the sense of minimizing the MISE(f') at given N. 

This implies the necessity to change the parameters of 
the method when increasing A'^. However, this fact is often ig- 
nored by users of such techniques, who keep the parameters 
constant when increasing A*' and then claim their method 
needs only 0{N) or less operations for the force approxi- 
mation. With such a practice, the force error will eventually 
be dominated by the bias, and increasing A'' does not im- 
prove it at all. Moreover, the method becomes statistically 
inconsistent in the sense that in the limit N —> oo the esti- 
mated force does not converge to the true force. These last 
remarks apply in particular to the expansion and grid-based 
techniques discussed below. 



To see that Wmax oc A'^'^^ for the optimal force estimate, 
let us re-write equations (tMI) as weight-function estimate 



^x)=J2^^^W{x.,x), W{x,y)=J2r,'Pr.{x)'P„{y). (35) 

The effective local softening length e may be defined as the 
distance between zeros of the highest-order basis functions. 
While this varies with position and direction, it is obvious 
that e oc n~!,x- The bias of the estimated force may beesti- 



mated in close analogy to the derivation in Appendix A 1.1 



However, because W{x,x + z) is not reflection symmetric 
w.r.t. z = 0, the bias is 0(e) and not O(e^) as for kernel es- 
timates. With vara;(i^) oc 1/Ne we then find for the optimal 
softening n^ax oc N^^^ and MISEopt(i') oc N'^^^. 

Thus, the main problem of the expansion technique is 
its large bias, arising from a mismatch, in all but special 
situations, between the stellar system and the set of basis 
functions. In order to overcome this fundamental problem, 
one must ensure that the low-o rder basis functions represen t 
the stellar system fairly well (Hernquist & Ostriker 1992), 



for instance, by a utomatically ad apting them to the stellar 
system modelled (Weinberg 1999). 



6.1.1 The tree code 

An important method is the Barnes & Hut (1986) tree code 
and its clones: the force is evaluated by direct summation 
for nearby bodies only, whereas distant bodies are grouped 
together and their gravity is computed via a low-order mul- 
tipole expansion. This method re duces the com putational 
costs to 0{N In N) or even 0{N) ( [Dehnen 200o| ). 

The ignorance of the detailed structure of distant parts 
of the system introduces some additional errors and one ob- 
jective of the A'-body experimenter must be to keep these 
errors always smaller than those inherent to the force estima- 
tion (by softening). The results of Athanassoula et al. (1998) 
suggest that this condition is indeed satisfied for common- 
practice applications, which essentially implies that the re- 
sults of this paper can be directly applied to the tree code. 



6.1.2 Expansion techniques 

Another method is based on an expansion of density and 
potential in a set of bi-orthogonal basis functions, <l>n{x): 



(34) 



where the second equality follows from the bi-orthogonality 
relation 47r J d?x <Prn = Snm ■ The sum over n in equa- 

tion (^^ includes terms up to some maximum Timax and 
contains 0{n'^^^) terms in total. 

This technique is important for special situations, in 
which the stellar system is closely matched, at all times, 
by the low-order basis functions. For example, when study- 
ing the stability of equilibrium systems of axial or polar 



svmm etrv, this is the method of choice (Earn & Sellwood 



1995). For less specialized investigations, the stellar system 



is not always well matched by the low-order basis functions, 
and the technique is essentially useless, because in this case 
Timax oc A'^^'''' for Optimal force estimation, resulting in com- 
putational costs that are Oln'^^^ N) — 0{N^). 



6.1.3 Grid-based codes 

Grid-based methods for the force evaluation are also quite 
common. In these methods, one estimates the density (via 
a kernel-estimator) on the nodes of a regular grid, solves 
for the potential and forces on these nodes by some grid 
method such as fast Fourier transforms, and finally inter- 
polates the forces at the body positions. The density errors 
follow the relations of Section 0, but do not propagate triv- 
ially to the force estimate, because the grid method itself 
introduces further biases. Therefore, the detailled results of 
this paper may not be directly applied to grid-based meth- 
ods, but the general remarks made at the beginning of this 
Section still hold. 



6.2 Two-body relaxation 

As already explained in the introduction, softening cannot 
much reduce the artificial two-body relaxation. The analytic 
results of Section ^ actually support this, which can be seen 
as follows. 

Since relaxation is is driven by the graininess of grav- 
ity, it thus must be related to the variances of force and 
potential. In the limit e — > of Newtonian gravity, the vari- 
ations of the force actually diverge. However, in order to 
obtain the orbital changes one has to integrate the forces 
along the orbit, and this integral may no longer diverge. Ac- 
tually, this integral must be related to the variation of the 
potential, since $ — J F ■ dx. In contrast to the variance of 
the force, the variance of the potential is finite for Newto- 
nian gravity, and it derives from the contributions not only 
of nearby stars, like the variance of the force, but from all 
stars. The same is true for the two-body relaxation, which 
suggests that it is related to the variance of the potential - 
though the situation must be more complex, since dynam- 
ical considerations (time variations) have to be taken into 
account. 



From equation (12a), it is clear that in order to re- 
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Table 2. Scaling with A'^ of the eopt and MISEopt(-F) for some 
softening methods appHed to a Hernquist model 



kernel 


a 


MISEopt(F) 


^opt 




Plummer 





0.000296 X Afg"" ''^ 


0.017 X Afg" 


0.23 


Fi 





0.000208 X Afg""''^ 


0.051 X Afg" 


1/5 


Ki 





0.000153 X A^g"*'''* 


0.115 X A^g" 


1/9 


Fi 


1/3 


0.000125 X Afg"*/^ 


0.186 X Afg" 


1/5 


Ki 


1/3 


0.000102 X N~^^^ 


0.309 X Afg" 


1/9 



The MISEopt{^^) was evaluated numerically for the Hernquist 
model sampled with A^ = 10^ points (see Section A^s = . 
A sensitivity of o = corresponds to a fixed-kernel estimate, i.e. 
constanst e. 



duce the integrated variance of the potential substantially, 
we need 



_i J&'x p{x) var°($) 



(36) 



For the Plummer sphere as underlying model and, say, the 
Fi softening kernel, we get e ~ 1.1 scale radii, implying 
that, unless the softening length is comparable to the size 
of the system itself, the variance of the potential cannot be 
significantly reduced by means of softening. This suggests 
that the usual small-scale softening does not much reduce 
the two-body re laxat ion in agreement with the arguments 
given in Section 1.1.1. 



6.3 What softening technique should one use? 

Let us re-consider the scaling relations of the MISE(i^ and 
Eopt with A*'. For various softening techniques, Table H lists 
them in the case of A'^ ~ 10^ and the Hernquist model as 
target. The numbers given in this table are based on the 
numerical results of Section ^ and the scaling relations of 
Section |^ also summarized in Table ^ below. 



6.3.1 The softening method 

We first concentrate on simple fixed-kernel estimators. 
When the standard Plummer kernel is replaced by a com- 
pact kernel (e.g. Fi) or a compensating kernel (e.g. Ki), the 
MISEopt(J<') drops by a factor of ~ 1.5 and ~ 2, respectively. 
Note that such a change requires no overhead at all. Even 
codes using the cubic spline kernel of Monaghan & Lattanzio 
(1985) might benefit from employing the Fi or Ki kernel in- 
stead, since their computation involves less operations and 
the latter gives more accurate forces. 

Another improvement by a factor of ~ 1.5 can be 
achieved by adapting the softening lengths locally in such 
a way that the number A'^soft of bodies in the softening vol- 
ume is roughly constant (but dependent on A'^). This result 
for the Hernquist model as target applies to a lesser degree 
to stellar system with a smaller dynamic range, e.g. with a 
density core like the Plummer model. 



Table 3. Asymptotic scaling relations with e and N 
kernel bias(F) eopt MISEopt(J') 



Plummer 

compact 

compensating 



^1.67 



<x N' 



-0.23 



(X N' 



-0.77 



<x Af-1/9 



oc Af-8/9 



Note that vax(F) always scales as e ^, so the scaling of 

bias(P') makes all the difference. 



6.3.2 How to choose e? 

There are actually two questions behind this one, both of 
vital importance to the subject, neither of which has been 
addressed so far. First, how to find the value for e that min- 
imizes the MISE(i^) if the true underlying force field is un- 
known (e.g. during a A^-body simulation) and, second, is eopt 
derived in this way really the best choice? 

Merritt (1996) presumed that a time-intensive boot- 
strap algorithm would be needed to find topt if F{x) were 
unknown. However, if A'' is large enough for the asymptotic 
relations of Section ^ to hold, one may exploit these in order 
to find eopt with an overhead comparable to one full force 
computation for all bodies. The details of this method and 
some performance tests will be given in a follow-up paper. 

The second question is less technical and more difficult 
to answer. The error of the force comes in two parts, bias 
and variance. In the time mapping of the DF, these will 
give rise to different artificial effects. The variance results 
in enhanced relaxation, while the bias modifies the stellar 
dynamics; moreover there might be some interplay between 
the two. It is not clear and non-trivial to say, what the con- 
sequences of these modifications are and how they scale with 
simulation time. It is likely that the optimal e depends both 
on the stellar system modelled and on the specific goal of 
the simulation. If, for instance, the stability of some equi- 
librium configuration is to be investigated, it is essential to 
modify gravity as little as possible, i.e. use small e (and a 
compensating kernel)^. On the other hand, when more vio- 
lent dynamics is investigated (e.g. mergers) one might want 
to suppress small-scale noise. Obviously, these are important 
issues which deserve further investigations. 



6.4 The optimal collisionless N-hoAy code 

Merritt (1996) proposed to call a Poisson solver optimal if 
it results in the smallest MISE(f') for given A''. With this 
definition, all non-direct methods are sub-optimal (since, in 
addition to the softening, they invoke approximations, which 
introduce additional errors), and yet such methods are gen- 
erally preferred. Therefore, one may better define: The op- 
timal collisionless N-body code achieves, for given computer 
resources (CPU time and memory), the most faithful time- 
mapping of those properties of the DF that are essential for 
the purpose of the particular simulation. Note that in this 
definition, the number N of bodies does not occur, N is just 



In these circumstances, a technique other than softeninj 



10 

and should be used for variance reduction: the quiet start (Sell- 
wood 198^ ). In this method, the initial phase-space positions are 
setup essentially noise free and, if the motion is predominantly 
regular, remain so for several dynamical times. 
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a parameter of the code, as is the softening length. This 
means in particular, that contrary to widespread practice, 
the number A'^ alone is not a sufficient criterion for the qual- 
ity of a A^-body simulation. 

What is the optimal Poisson solver for general-purpose 
coUisionless A^-body codes? From the discussion in Sec- 
tion 3.1, it is clear that expansion techniques are not suit- 
able for a general applicable A'^-body code - unless, perhaps, 
Weinberg's (1999) automatic adaption method is continu- 
ously applied. Obviously, the optimal Poisson solver must be 
adaptive in space and time. This condition clearly favours 
the tree code, which naturally is fully adaptive, while for 
grid-based solvers a considerable effort is needed to make 
them fully adaptive. 

From the results of this paper, it appears to be obvi- 
ous that the optimal softening method uses adaptive soft- 
ening lengths and a compact, or even compensating, kernel. 
A fixed softening length may be appropriate in the case of 
a stellar system with a small dynamic range, i.e. with a 
constant-density core. 



6.5 On the resolution of a AT-body simulation 

In discussing TV-body simulations, it is important to know 
their resolution scale, i.e. the smallest scale on which a simu- 
lation is "believable". Lacking a more precise definition, one 
often simply uses the softening length e for the resolution 
scale. With the asymptotic relations in Table ^, this yields 
the absurd result that users of Plummer softening can im- 
prove the resolution of their simulations faster with increas- 
ing A^ than users of compact or compensating kernels, which 
we have shown to be superior to Plummer softening. How 
can this be? 

The problem is that linearly relating e with resolution 
scale is incorrect. The softening length e is a mere parame- 
ter without any physical meaning. A force resolution length 
may be properly defined as the scale over which the maxi- 
mum change in the true force equals the (local) error of the 
estimated force: 



(37) 



l|V®F(a;)|r 

With the results of Section ^ we find that, for optimally 
chosen e, the decrease of Sros with increasing A'^ is slowest 
for Plummer softening and fastest for softening with com- 
pensating kernels. In particular, the relation between e and 
Sress is uon-liuearQ 

Note, however, that Sios defined above is a mere mea- 
sure for the resolution of the force, and not of the believ- 
ability of the simulation. Since gravity is a long-range force, 
local errors in the DF, introduced on scales of Sres, may well 
propagate to larger scales. 



7 SUMMARY AND CONCLUSION 

In coUisionless A'-body simulations, the bodies are the nu- 
merical representation of the continuous one-particle DF. 



Note, however, that in a cusp F becomes discontinuous and 
equation invalid. In such a situation one has indeed Srcs e- 



Because A' is much smaller than in the stellar system be- 
ing modelled, this representation is always incomplete, i.e. 
noisy. In order to estimate, at each time step, the contin- 
uous force field due to the DF, one has to moderate this 
noise. This exactly is the purpose of force softening in coUi- 
sionless A'^-body codes. However, while reducing the noise in 
the A'^-body forces, the softening modifies the laws of grav- 
ity: the force vanishes at vanishing inter-body separation 
r —> 0. This generates a bias, a systematic offset between 
the estimated and true forces. Both, the noise and the bias, 
contribute to the mean square error of the estimated force 
field. While the noise decreases with the softening length e, 
the bias increases, and the opti mal softening represents a 
compromise between these two ( Merritt 1996 ). 

In the limit of e being small compared to any physical 
scale of the stellar system, a requirement that is desirably 
satisfied in any A''-body simulation, analytic expressions for 
the noise-generated variance of the force and for the bias 
have been derived. While the relations (^ for the bias have 
already been given in the literature (e.g. Hernquist & Katz 
1989), those for the variance of the estimated potential and 
force (equation seem to be hitherto unknown. One of 
the most important results is that the variance of the force 
is a local quantity diverging in the limit e — > 0, very much 
like the variance of the density, while the variance of the 
potential depends on a global measure and reaches a finite 
value for e = 0. Presumably, this is related to the fact, see 



Section 3.2 and Theis (1998), that the two-body relaxation 
is not substantially reduced by force softening. 

In these asymptotic relations, the effects of the soften- 
ing length, the softening kernel, the number A'' of bodies, 
and of the underlying stellar system nicely separate, allow- 
ing simple and, in the important case of large N, accurate 
estimates for the behaviour of the mean integrated squared 
force error, MISE(i^). For the various types of softening ker- 
nels. Table |^ gives the asymptotic relations for the opti- 
mal softening length and force errors (the relations given 
for Plummer softening are based on numerical experiments 
only) . The strongest possible dependence of the MISE on A'' 
is N~'^ as pertains, e.g., to parametric estimators. Note that 
the compensating kernels come very close to this limit. The 
main results of this investigation are: 

1. Compact softening kernels, i.e. those with finite density 
support, are superior to the standard Plummer softening. 
This improvement is ever larger for increasing A'^. 

2. Special softening kernels are derived, for which the force 
errors are even smaller. These kernels compensate the 
small forces at r — > by forces larger than Newtonian at 



3. For inhomogeneous stellar systems, such as galaxies, 
adaptive softening lengths allow a further substantial re- 
duction of the force errors. 

All these numbers are based on the Hernquist model, which 
gives a fair representation of a typical galaxy, as underlying 
stellar system, resulting in a rms force error of ~ 0.01 = 
0.04 of the rms force of this model. For smaller errors to be 
achieved, the advantage of using compensating kernels and 
adaptive softening lengths will become even larger. 

For this error of 4%, particle numbers A'^ ~ 10^ are 
necessary, and roughly 20 times as many to reduce the er- 
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ror to 1%. It is not clear, what these errors imply in terms 
of the reliability of the time evolution of a corresponding 
A'^-body simulation. This is a very important question to 
answer, in order to better understand the results of iV-body 
simulations and assess their reliability. For example, current 
A''-body simulations of large-scale structure formation em- 
ploy only about 10^~* bodies per halo, which come out to 
have a profile similar to a Hernquist model, i.e. the rms force 
error is 10% to 25%, even if e was chosen optimally, which 
it very likely was not. 
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APPENDIX A: ASYMPTOTIC RELATIONS FOR BIAS AND VARIANCE 
Al Fixed-kernel estimates with equal weights 

A 1.1 The bias 

In calculating the expectation value, we may replace the sum over points by an integral weighted with the density p. Thus 



G 



d y p{v) 



\x - y\ 



We proceed by writing 



e \e/ r e ir \e 
Inserting this into equation ( |A1| ) and substituting y = x — ez, we find 

($(a;)) = ^{x)+Ge^ Jd' 



p{x — ez) 



When replacing p{x — ez) by its Taylor expansion about x, 



(Al) 



(A2) 



(A3) 



(A4) 



we see that, by virtue of the symmetry of the kernel, the integrals over all odd orders in e vanish identically and one obtains 
bias^(<l) =aoe^Gp{x) + a2e'^GAp(x) + 0{e^) (A5) 
where 



47r 



(fc + 1)! 



f oc 

/ dr r'''^^ 


ri 

— ( 


P{r) 




.r 





47r 



(fc + l)!(fc + 3) 



dr r 



fc+3 



(fc + 3)! 



dr r ??(?■)• 



(A6) 



A similar estimate can be made for the bias of the force and density estimators in equations ( ^b[ ) and (^c[), but it is actually 
simpler to relate the biases directly using Poisson's equation. Note, that the integrals over tPz in equation (A3) may not 
converge if the kernel does not approach its asymptotic shape (j) 1/r fast enough at large radii. To be more precise, if the 
density kernel rj falls off as r~® or less steeply, then the biases grow faster than quadratically with small e. 

Furthermore, if the Taylor expansion in equation (A4) does not converge, the above estimate is incorrect, too, usually 
over-estimating the true bias. To see this, consider a spherical cusp i n th e density at y = 0, i.e. p = poy^'' (with 7 < 2) at 
small y. Inserting this instead of the Taylor expansion into equation (A3), we find for a; <C e 



biascusp ("i") ~G poe^ 47r 
biaScusp(-F') = -Gpo 7<:^"^ 47r 
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(A7a) 
(A7b) 



Here, the integrals extend only formally to infinity: for a sensible kernel, the term in brackets vanishes outside some small 
radius. Thus, for cusps shallower than 7=1, the bias in the force is still finite; for steeper cusps, the correct force actually 
diverges, which of course cannot be reproduced by the softening, resulting in a diverging force bias. 



Al. 



The variance 



Assuming the positions Xi are independent, the variance of some quantity is equal to A'^ ^ times the variance of the contribution 
of one point. Moreover, the central-limit theorem tells us that the estimates follow essentially a normal distribution. 

The variance of the estimated potential. With Ti = Xi — x, the contribution of one point is (jiiix) = —GM e"^ (f>{\ri\/ e). 
Thus, from equation (b|) we have 



iVvar,(<l) = {(fi{x)) - [$(a;) +bias,(-l)]'. 

We may evaluate {4>)1 as an average weighted with the density p: 



d^y p{y) 4>'^ 



We proceed by writing 



(A8) 



(A9) 



(AlO) 
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to finally obtain 
with 6* = 4tt dr [1 - r'^<p^{r)]. 



d% p^y^ 



(All) 



The variance of the estimated force. The contribution from one body is — ri/\ri\ G M e ^</>'(|ri|/e), and in analogy 
to equation (A_9), we have 



d^y p{v) 'i>' 



x-y\\ {x-y)®{x-y) 



Substituting y = x — ez and Taylor expanding p, we find for the variance 

var^(F) = G^e~' AfiV"'&Fp(a;) i| + 0(e°), 

where I denotes the unit matrix, while bp ~ 47r dr r'^ (f)'^ (r) . 

The variance of the estimated density. The contribution from one body is pi = M e~^'q{\ri\/e), and 

M^e-'^bpp{x)+0{e-'') 



(A12) 



(A13) 



with 6p = dr r^T]^ (r). 



(A14) 



A2 Adaptive softening with equal weights 



In analogy to equation (A3), we get for the expectation value of the estimated potential 



{Hx))=${x)+Ge' Id'z ^(^-") 



A 



- - 0(0 



r=|z|/A 



(A15) 



For equal weights, equation (^5|) reduces to A = [p(x — ez)/'p]~°'. In order to estimate the asymptotic behaviour at small e, 
we may replace p in this relation by p. For kernels with ao 7^ 0, the error made by this assumption is 0{e^) and hence the 
resulting error in the bias of the potential is a factor smaller than the leading term. We thus get 



bias,($) =aoGe'/"p(a;)i-'"+0(e*). 



(A16) 



which for a = reduces to the result for the fixed-kernel estimate. 

For a = 1/2, the quadratic term in bias(<i>) is constant and the biases of force and density are formally only Oie^). 
However, this was not anticipated in the approximation made above, where p in the definition of A was replaced by p. Thus, 
for this special case, the estimation might be incorrect. 

Because the variances are dominated by the lowest order in e, we may use the above relations for the fixed-kernel estimates 
with e replaced with e[p(x)/p]~". 

A3 Softening with individual weights 

In this case, the softening lengths eXi depend on the full 6D phase-space coordinates, and we must replace the density-weighted 
averaging over configuration space by a sampling-distribution-function-weighted averaging over phase space. 



A3.1 The bias 



With equation (A2) we obtain 



bias.(*) = ^j'^'vj /=(y.«) ^^^^ 
with 

\{y,v) = p{y,vf''^ [n(y)/n]"". 



r=\x—y\/tL\{y,-u) 



(A17) 



(A18) 

where p{y,v) is given in equation (jj^). In order to proceed, we first replace n by n in the definition of A in analogy to the 
equal- weight case, and second substitute y = x — eXz with \dy/dz\ = e^A^(a;, v ) + 0(e*2) + O(e^z^). We then finally get 



biasa; (<l>) = ao Ge^ p{x) p,{x) [n{x)/n\ 



0{e 



(A19) 



where p = p} with p" = p ^ Jd'^vfp"^ the mass-weighted average of p°' of all trajectories passing through x. 
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A3. 2 The variance 



For the force variance, we have instead of equation (A12) 



{x-y)®{x- y) 
(x-y)2 



When substituting y — x — e\z and using equation (A.18), we get for the force variance to lowest order in e 
va.r^{F) ^ M bp p{x)Ji^{x) [n{x)/n]" + 0(e°). 



(A20) 



(A21) 



Note that for q = these relations do not reduce to those given for fixed-kernel estimates, because, with our definition of 
(equation ba), individual weights also result in individual bandwidths. 



APPENDIX B: INVESTIGATIONS FOR BETTER KERNELS 

We wa nt to design a spherical kernel <jf>(r) that corresponds to a compact density ri{r) and satisfies all conditions listed in 
Section 3.3. Let us therefore assume that r]{r) — and (f){r) = for r > 1. Moreover, continuity of and the normalization 



of rj give us the constraints 
0'(O) = O, </<(l) = l, 



(1) 



(Bl) 



Let us now make an ansatz for (f> that, in order to satisfy point 4 of the list in Section 3.3, involves only powers of r for r < 1: 



for r < 1 



and 



otherwise. 



(B2) 



The constraints (Bl) require Co ~ 1 and Ci — ^, but do not require anything for the higher-order coefficients. When these 
are all set to zero, one obtains the homogeneous-sphere kernel. 



Bl Ferrers sphere kernels 



The kernel of the form (B2) with 



1 /2A- — 1 \ 

Co^l, Ci = i, Cfe = ^5^f ^ j for fc = 2,...,n + l (B3) 

is continuous in the first n force derivatives, or, equivalently, in the first (n — 1) derivatives of the density. If in addition Ck — 
for fc > n + 1, we obtain kernels that are Ferrers (1877) spheres of index n, denoted F„ in this paper. They have density 

where H denotes the Heaviside function. The first few are also known as homogeneous sphere (n = 0), Epanechnikov {n — 1) 
and biweight (n = 2) kernel and are listed in Table ^ 

B2 Compensating kernels 

Kernels with oq = (equation |ll|), i.e. with creating reduced bias, and continuity in the first n force derivatives can be 



obtained by setting Ck for A; > n + 1 as in equation ( B3 ) and 



^"+-= (n + 2)!(n + 3)l(2!! + 5)2^--^a ^ - ^ for > n + 2. (B5) 

These kernels, which are denoted K„ in this paper, may be considered as Ferrers spheres with a higher-order addition. They 
have density 

Vir) = ^^^,)ll:t2V.2^.,.^ (5 - [2n + 7^) d - - r^Y (B6) 

The properties of the first four kernels K„ are listed in Table |l|. 
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